ABSTRACT

In this paper we present a new technique for analyzing
the control flow of a computer program. This technique, called
structural analysis, extends new interval analysis techniques
and produces a program representation in which structured
control-flow patterns are detected and recorded. This
representation supports data-flow analysis elimination techniques
similar to Rosen's high-level data-flow analysis technique,
which are faster than interval-based methods. Moreover, these
results indicate that flow-graph based program analysis and
direct analysis of the program's parse tree can be performed
by essentially the same methods, making a uniform data-flow
analysis procedure for optimizing compilers possible.


1.  Introduction

In this paper we present a new approach to program flow
analysis. It extends the improved interval analysis and
interval-based data-flow analysis methods of Schwartz and Sharir [SS]
in the sense that it also detects standard structured control-flow
patterns, such as 'if-then-else', 'begin-end', 'while-do' etc.,
and then uses explicit formulae of Rosen's kind [Ro1] to analyze
the data flow within these structures.

Assuming  a  standard  intermediate-level  code  representation 
of  the  program  to  be  analyzed  (i.e.  a  flow-graph  consisting  of 
basic  blocks  [He],  [AU]),  our  approach  begins  with  a  structural 
analysis  phase.   This  is  a  generalized  interval  analysis  step 
which  uses  a  depth-first  spanning  tree  to  construct  a  hierarchical 
tree  structure  of  flow  subgraphs  imbedded  within  each  other. 
These  subgraphs  (control  structures)  can  be  either  the  standard 
control-flow  patterns  mentioned  above,  or  else  general  intervals 
(with  or  without  imbedded  irreducibilities)  of  the  kind  described 
in  [SS].   If  the  program  to  be  analyzed  is  well-structured,  then 
our  structural  analysis  will  uncover  most  of  the  program's 
original  control  structures  and  can  therefore  be  regarded  as 
an improved 'unparser' of the program control flow. Unlike
the graph-grammar approach of [FKZ], our method can process
any flow graph; it tolerates irreducibilities and 'localizes'
them as in [SS], and in addition its performance improves
as the programs to be analyzed become better structured.
A similar structural analysis technique has been described by
Baker [Ba], though only for the purpose of source-to-source
program transformation; our algorithm is simpler
and handles nonstructured control-flow patterns in a way
more suitable for data-flow analysis.

The benefits of this type of structural analysis are as
follows:

(i) It facilitates the use of elimination techniques
which use explicit data-flow formulae of Rosen's kind [Ro1]
for subsequent data-flow analysis. Our technique generalizes
similar interval-based techniques in [SS], but turns out to be
faster for well-structured programs.

(ii) Our techniques indicate that flow-graph based program
analysis and direct analysis of the program's parse tree, as
considered by Rosen [Ro1] and in a completely different context by
Wulf et al. [Wu], are not as different as they may seem, and that
uniform program analysis procedures can handle either kind of
representation by essentially the same methods. This should
lead to a uniform discipline of program flow analysis in optimizing
compilers, which might discard the use of a flow-graph
representation altogether and instead work from a representation
similar to the program parse tree, which is easily produced by
the compiler front-end. To sustain this claim, we note in this
paper that various frequently encountered pragmatic
optimizations, such as interprocedural analysis, forward vs. backward
analysis, code motion and code sinking, and also analyses which
cannot be performed using elimination techniques, can all be
performed using such a program representation, mainly by
generalizing the methods of [SS] to such a hierarchical
representation, in a way similar to that by which intraprocedural
forward analysis is handled in this paper. If, however, a
flow-graph representation is preferred for some other reason, then
the structural analysis techniques described in this paper can
be used to allow high-level data-flow analysis to be performed
on the flow graph with the same advantages as those yielded by
parse-tree based analysis.

Our structural analysis extends an improved interval analysis
technique described in [SS], based on a graph-reducibility testing
algorithm of Tarjan [Ta]. Our technique and notations are
closely related to those of [SS]. Also, as in [SS], we do not
attempt to produce an almost-linear algorithm such as Tarjan's
original algorithm; doing so would complicate the algorithm, which
for most typical programs will be linear anyway.

The  paper  is  organized  as  follows:   Section  2  describes 
our  structural  analysis  algorithm.   Section  3  sketches  an  elimination 
algorithm  for  data-flow  analysis  based  on  these  structural  analysis 
techniques  and  comments  on  the  performance  of  this  algorithm. 
Concluding comments follow in Section 4.

2.  Structural Analysis

Suppose that a program to be analyzed is represented as
a flow-graph G with node set N, entry node r and set of edges E.
Each node $n \in N$ represents a basic block of intermediate-level
code, i.e. a single-entry single-exit sequence of instructions.
For simplicity we assume that the program to be analyzed contains
no subprocedures (or that we are analyzing a single procedure at
a time and ignore interprocedural flow).
Assuming that the program to be analyzed is likely to be
relatively well-structured, our aim is to transform G into a
hierarchy of small flow graphs each belonging to some familiar
class of relatively simple control structures expected to occur
in the program, such as 'if-then-else', 'begin-end', 'while-do' etc.
In other words, we want to parse the flow graph G into its
underlying structures. However, unlike the graph-grammar approach of
[FKZ], our method does not require the hierarchical decomposition
to consist solely of graphs having special structure, but can
tolerate the 'bad' flow which may result from occasional explicit
'goto' statements occurring in the program being analyzed.

We assume basic blocks to be single-exit code segments.
However, they can be combined into more complex flow structures
which need not necessarily have a single exit. Here an important
observation is that if 'abnormal' exits are allowed in the
control structures we set out to find, significant control-flow
'ambiguities' may be encountered. As an example, consider the
following flow graph G:

[Figure: flow graph G; the drawing is not recoverable from the source.]

Here the subgraph G' consisting of the nodes C, D, E can be
interpreted as an if-then-else structure, having an 'abnormal'
exit from E to A (F is the 'natural' exit of this structure).
But if we analyze G' in this way, and 'collapse' it to a single
node C*, we will obtain the following reduced graph:

[Figure: reduced graph; the drawing is not recoverable from the source.]

This forces us to regard the block D as part of the loop starting
at A, which is usually undesirable since, for example, it
complicates code motion: we certainly do not wish to apply
the same profitability criterion to code motion from D out of that
loop as to code motion from C or E out of that loop.

For  the  reasons  which  this  example  illustrates  we  will  tend 
to  look  suspiciously  at  structures  having  abnormal  exits,  and 
will  therefore  concentrate  on  single-entry  single-exit  explicit 
structures.   Only  when  such  structures  do  not  suffice  will  we 
allow  general  loops  (intervals)  to  have  multiple  exits;  such 
loops  will  then  be  reduced  without  associating  any  well-structured 
flow pattern with them. (This is quite different from the approach
of [Ba], where standard control structures with abnormal exits are
allowed.)

The flow patterns that our structural analysis aims to
detect are as follows (each pattern is given a mnemonic code):

(1) 'BLOCK': sequential 'begin-end' block of the form
$A_1, A_2, \ldots, A_n$, where $A_1, \ldots, A_n$ are single-entry
single-exit structures and for each $j < n$, $A_j$ is the only
predecessor of $A_{j+1}$, and $A_{j+1}$ is the only successor
of $A_j$.

(2) 'IFTHEN': if $A_1$ then $A_2$; end if;
where $A_1$, $A_2$ are single-entry single-exit structures. $A_2$ has
$A_1$ as its only predecessor and has only one successor, which is
also the other successor of $A_1$ (which has two successors).

(3) 'IFTHENELSE': if $A_1$ then $A_2$; else $A_3$; end if;
where $A_1$, $A_2$, $A_3$ are single-entry single-exit structures,
$A_2$ and $A_3$ are the only successors of $A_1$, both $A_2$ and
$A_3$ have $A_1$ as their only predecessor, and both have only one
and the same successor. (For simplicity, we ignore 'case' statements.)

(4) 'WHILELOOP': while $A_1$ do $A_2$;
where $A_1$, $A_2$ are single-entry single-exit structures, $A_2$ has
$A_1$ as its only predecessor and successor, and $A_1$ has one other
successor.

(5) 'LOOP': a self-loop; consists of one single-entry single-exit
structure $A_1$ which has itself as one of its successors.

(6) 'PROPINT': a general proper strongly-connected interval
(i.e. a strongly connected graph I having a single entry node
$x \in I$, which also has the property that all cycles within I pass
through x).

(7) 'IMPROPINT': a general improper strongly-connected interval
(a single-entry strongly-connected subgraph containing
multiple-entry subcycles); see [SS] for a detailed account of the
various interval structures considered here.

(8) 'PROPOUTINT': a general (single-entry) acyclic flow graph
(a proper 'outermost interval' of [SS]).

(9) 'IMPROPOUTINT': a general (single-entry) graph containing
only multiple-entry cycles (improper outermost interval of [SS]).

Note that all these structures are single-entry, which is essential
for our algorithm and subsequent data-flow analysis.
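The local tests behind patterns (1)-(3) can be illustrated by a small sketch. It is not the algorithm of this paper: it assumes that every node is already a single-entry single-exit structure, recognizes only the two-node form of 'BLOCK', and uses hypothetical names (`predecessors`, `classify_acyclic`) and a hypothetical flow graph given as a successor map.

```python
def predecessors(succ):
    """Invert a successor map {node: [successors]} into a predecessor map."""
    pred = {n: [] for n in succ}
    for m, targets in succ.items():
        for n in targets:
            pred[n].append(m)
    return pred


def classify_acyclic(x, succ, pred):
    """Return (pattern, members) if x heads a small acyclic pattern, else None.

    Assumption: every node is an already-reduced single-entry
    single-exit structure, and successor lists hold no duplicates.
    """
    out = succ[x]
    if len(out) == 2:
        a, b = out
        # IFTHENELSE: both arms are entered only from x and share one successor.
        if (pred[a] == [x] and pred[b] == [x]
                and len(succ[a]) == 1 and succ[a] == succ[b]):
            return ("IFTHENELSE", [x, a, b])
        # IFTHEN: the arm is entered only from x and its single successor
        # is the other successor of x.
        for arm, other in ((a, b), (b, a)):
            if pred[arm] == [x] and succ[arm] == [other]:
                return ("IFTHEN", [x, arm])
    elif len(out) == 1:
        y = out[0]
        # Two-node BLOCK: y is x's only successor, x is y's only predecessor.
        if y != x and pred[y] == [x]:
            return ("BLOCK", [x, y])
    return None


# Hypothetical flow graph, used only for illustration.
succ = {
    "A": ["B", "K"], "B": ["C", "I"], "C": ["D"], "D": ["H", "E"],
    "E": ["F", "G"], "F": ["D"], "G": ["D"], "H": ["I"],
    "I": ["J"], "J": ["I", "A"], "K": ["L"], "L": [],
}
pred = predecessors(succ)
print(classify_acyclic("E", succ, pred))
print(classify_acyclic("I", succ, pred))
print(classify_acyclic("D", succ, pred))
```

The cyclic patterns (4)-(9) cannot be recognized by such purely local tests; they depend on back edges relative to a depth-first spanning tree.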
Next we sketch our structural analysis algorithm.

I.  Construct  a  depth-first  spanning  tree  T  for  G. 

II. Initialization Step: Control structures detected by our
algorithm are inserted into the graph as new nodes. If x is the
entry node of an identifiable structure $S_x$, then (a node identifying)
$S_x$ is inserted into G in the following way:

(a) All edges entering x from outside $S_x$ are replaced by
edges entering $S_x$.

(b) For each exit edge (u,v) of $S_x$, we add a 'virtual'
edge ($S_x$, v).

(c) $S_x$ is inserted into T in place of x and replaces
x during subsequent graph reductions. Note that there
is no need to rebuild a depth-first spanning tree for
the reduced graph.

Note also that the hierarchical flow graphs that we construct
are all 'blended' into one flow graph.

Our algorithm also produces the following data structures,
which are all initialized in this phase to be empty.

STRUCTOF:    maps each node in G to the structure node which
             immediately contains it.
STRUCTYPE:   maps each structure node to its structural type.
STRUCTURES:  list of all structure nodes in their postorder. (This
             is an 'innermost to outermost' order of structures.)
STRUCNODES:  maps each structure node to the list of its nodes
             in relative reverse postorder.
III. We iterate through the nodes of the graph in postorder
(relative to T). Let x be the current node. We first test whether
x is an entry node of one of the simple acyclic structures (1)-(3)
in the list of control structures given above. If x is such an
entry node, then we create a new node $S_x$, insert it into the graph
in the manner explained above, add $S_x$ to the list STRUCTURES, set
STRUCTYPE($S_x$) to the type of the control structure detected, set
STRUCTOF(y) to $S_x$ for all y in the structure $S_x$, and set
STRUCNODES($S_x$) to the list of all the y's constituting $S_x$,
arranging these y's in their reverse postorder. 'Begin-end' blocks
require special treatment: If $A_1, A_2, \ldots, A_n$ form such a
block, then they will be processed in the order
$A_n, A_{n-1}, \ldots, A_1$. Without special precaution,
for each $j < n$, $A_j$ might be found to be an entry node of such a
block and the algorithm would wind up creating n-1 nested two-node
blocks from this sequence. To avoid this we modify our analysis
of such blocks as follows: When processing a node x we test
whether x has only one successor y which has x as its only
predecessor; if so, and if y is itself a (node identifying) 'begin-end'
block, then we simply append x to the start of that block, delete
y from the graph (undoing all the actions taken to insert y into
the graph) and obtain a new 'begin-end' block starting at x, which
is then inserted into the graph as before.

If x has not been found to be an entry of such a structure,
we determine whether x is the target of a back-edge. If so, we
construct the set

   REACHUNDER(x) = {y : y is a node in G such that STRUCTOF(y) is
                    undefined and x can be reached from y along
                    a path not going through x whose final edge
                    is a back edge} $\cup$ {x}

as in [Ta], [SS]. The following cases can occur:
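One way to compute this set is a backward search from the sources of the back edges entering x, as sketched below. The sketch makes simplifying assumptions that are not part of the paper: no reductions have taken place yet (so STRUCTOF is undefined for every node and the filter is omitted), every node is reachable from the root, and the names `dfs_times`, `reach_under` and the sample graph `EXIT_LOOP` are hypothetical.

```python
def dfs_times(succ, root):
    """Entry and exit times of a depth-first search from root."""
    tin, tout, clock = {}, {}, [0]

    def visit(n):
        tin[n] = clock[0]
        clock[0] += 1
        for m in succ[n]:
            if m not in tin:
                visit(m)
        tout[n] = clock[0]
        clock[0] += 1

    visit(root)
    return tin, tout


def reach_under(x, succ, root):
    """Nodes that reach x by a path avoiding x whose last edge is a back edge."""
    tin, tout = dfs_times(succ, root)
    pred = {n: [] for n in succ}
    for m, targets in succ.items():
        for n in targets:
            pred[n].append(m)

    def is_descendant(u, a):
        # u lies in the spanning-tree subtree rooted at a.
        return tin[a] <= tin[u] and tout[u] <= tout[a]

    # A back edge into x starts at a descendant of x in the spanning tree.
    work = [u for u in pred[x] if is_descendant(u, x)]
    result = {x}
    while work:
        y = work.pop()
        if y in result:          # never walk back through x or revisit a node
            continue
        result.add(y)
        work.extend(pred[y])
    return result


# Hypothetical graph: a loop A -> B -> D -> A with an early exit B -> C -> E.
EXIT_LOOP = {"A": ["B", "E"], "B": ["C", "D"], "C": ["E"], "D": ["A"], "E": []}
print(sorted(reach_under("A", EXIT_LOOP, "A")))
```

The size and membership of the returned set are what the case analysis inspects: a singleton indicates a self-loop, and a member that is not a descendant of x indicates a multiple-entry cycle.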

(a) #REACHUNDER(x) = 1. x forms a self-loop, i.e. the structure
'LOOP'. Reduce {x} as such, in the manner explained above.

(b) #REACHUNDER(x) = 2 and the other member y of this set has
x as its only successor. Then {x,y} form a 'WHILELOOP'.
Reduce {x,y} as such, in the manner explained above.

(c) REACHUNDER(x) contains a nondescendant node of x. Then
x is an entry to a multiple-entry cycle. Mark x as such
and reduce nothing. (This case and the following one are
discussed in [SS].)

(d) If none of the above cases is detected, then I = REACHUNDER(x)
forms an interval (in the terminology of [SS]) with entry
node x. If I contains entries to multiple-entry loops then
I is of type 'IMPROPINT', otherwise I is of type 'PROPINT'.
Reduce I accordingly in the manner described above.

If no reduction has taken place, then we go on to process
the next node in postorder. Otherwise the new structure node $S_x$
is logically added to T in place of x, and $S_x$ will be
the next node to be processed in this phase.
IV. When iterative application of step III has terminated, the
final reduced graph G' (consisting of those nodes y for which
STRUCTOF(y) is still undefined) will be either a singleton (for
well-structured programs), or an acyclic graph (a structure of
type 'PROPOUTINT') if it does not contain multiple-entry cycles,
or else a structure of type 'IMPROPOUTINT'. If G' is not a
singleton, a final reduction takes place according to the type
of G'.
Examples:

(1) First consider a well-structured program

begin
  while A do
  begin
    if B then
    begin
      C
      while D do
        if E then F else G end if
      H
    end
    end if
    repeat I until J
  end
  K
  L
end

Here A, B, ... are basic blocks.
The corresponding flow-graph, represented as a depth-first spanning
tree with nontree edges added as dotted edges, is

[Figure: depth-first spanning tree of the flow graph, rooted at A;
the drawing is not recoverable from the source.]

Accordingly, the node postorder is JIHFGEDCBLKA.
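The quoted postorder can be reproduced with a short depth-first search. Since the drawing of the graph is not available, the successor map `EXAMPLE1` below is an assumption: it was derived by hand from the program text, and the order in which successors are listed (which determines the spanning tree) was chosen to match the quoted postorder. The function name `postorder` is likewise hypothetical.

```python
def postorder(succ, root):
    """Nodes in the order in which a depth-first search finishes them."""
    seen, order = set(), []

    def visit(n):
        seen.add(n)
        for m in succ[n]:
            if m not in seen:
                visit(m)
        order.append(n)

    visit(root)
    return order


# Assumed flow graph of Example (1), read off the program text.
EXAMPLE1 = {
    "A": ["B", "K"],   # while A: loop body starts at B, exit goes to K
    "B": ["C", "I"],   # if B: then-part starts at C, otherwise fall to I
    "C": ["D"],
    "D": ["H", "E"],   # while D: exit goes to H, body starts at E
    "E": ["F", "G"],   # if E then F else G
    "F": ["D"],
    "G": ["D"],
    "H": ["I"],
    "I": ["J"],
    "J": ["I", "A"],   # repeat I until J, then back to the outer loop test
    "K": ["L"],
    "L": [],
}
print("".join(postorder(EXAMPLE1, "A")))
```

Step III of the algorithm visits the nodes in exactly this order, so inner constructs such as the repeat loop {I, J} are met before the constructs that contain them.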

The following table summarizes the analysis that would be
performed in step III of the algorithm (for notational simplicity,
$S_x$ is denoted as x* in the table). Note that all the control
structures used in the program are detected by our algorithm,
except that the 'repeat-until' loop {I, J} is analyzed as a
self-loop containing a block; this happens because there is no logical
separation between the end of the 'loop-body' I and the beginning
of the 'loop-test' J.
Node Processed    Structure Detected

J                 none
I                 I, J form a 'BLOCK' I*
I*                I* forms a 'LOOP' I**
I**               none
H                 none
F                 none
G                 none
E                 E, F, G form an 'IFTHENELSE' E*
E*                none
D                 D, E* form a 'WHILELOOP' D*
D*                D*, H form a 'BLOCK' D**
D**               none
C                 C, D** form a 'BLOCK'; after deleting
                  D**, C, D*, H form a 'BLOCK' C*
C*                none
B                 B, C* form an 'IFTHEN' B*
B*                B*, I** form a 'BLOCK' B**
B**               none
L                 none
K                 K, L form a 'BLOCK' K*
K*                none
A                 A, B** form a 'WHILELOOP' A*
A*                A*, K* form a 'BLOCK'; after deleting
                  K*, A*, K, L form a 'BLOCK' A**
A**               none
[Figures I-V: the drawings are not recoverable from the source.
Captions:
Fig. I: graph after reduction of I**
Fig. II: graph after reduction of E*
Fig. III: graph after reduction of D**
Fig. IV: graph after reduction of C*
Fig. V: graph after reduction of B**]
(2) Next we consider a well-structured program with an exit
command within a loop.

begin
  while A do
    if B then
    begin
      C
      exit while
    end
    else
      D
    end if
  E
end

The corresponding flow graph is

[Figure: flow graph; the drawing is not recoverable from the source.]

and the node postorder is E C D B A.

The following table shows the course of step III of our flow
analysis algorithm.
Node Processed    Structure Detected

E                 none
C                 none
D                 none
B                 none
A                 A, B, D form a 'PROPINT' A*
A*                A*, C form an 'IFTHEN' A**
A**               A**, E form a 'BLOCK' A***

Since we do not allow abnormal exits in structures other than
general intervals, our algorithm does not detect {B, C, D} as
an 'if-then-else' structure, and consequently places C outside
of the loop A*. However, this choice is superior to the standard
interpretation of the program code in this example, since C does
not really belong to the strongly connected part of the while
loop. Note that the 'if-then' structure A** can be read heuristically
as

   if (the loop A* has terminated via the abnormal exit)
   then C end if
3.  Data-flow Analysis for Structural Graphs

Next  we  will  describe  an  algorithm  for  applying  data-flow 
analysis  of  the  bitvectoring  class  to  the  kind  of  structural  flow 
graphs  produced  by  the  analysis  described  in  the  previous  section. 
The algorithm to be described uses elimination techniques and
blends and generalizes similar algorithms of Rosen [Ro2] and
Schwartz and Sharir [SS]. We follow the notations of [SS] closely.
For  simplicity  we  will  initially  only  consider  intraprocedural 
forward  data-flow  problems,  i.e.  will  assume  (as  we  did  in  the 
previous  section)  that  the  program  being  analyzed  involves  no 
interprocedural  flow,  and  also  assume  that  the  data  we  are  seeking 
is  to  be  propagated  in  the  direction  of  execution  flow. 

Although Rosen's high-level data-flow analysis technique
[Ro2] is applicable to more general data-flow analyses (those
involving 'rapid monoids', in the terminology of [Ro2]), we
restrict  ourselves  to  the  bitvectoring  case,  since  it  is  both 
the  simplest  and  the  most  likely  to  occur  in  optimizing  compilers. 
(Moreover,  the  technique  we  describe  can  be  easily  generalized  to 
rapid  data-flow  analysis  problems.)   For  the  sake  of  completeness 
we  begin  by  summarizing  the  notations  of  [SS]  used  to  describe 
our  algorithm. 

A bitvectoring data-flow analysis is performed using a
framework (L,F), where L is a lattice having the form $2^E$ for
some finite set E (elements of L are represented as bitvectors
over E), where meet in L is either set-intersection or set-union,
depending on the type of data being sought (here it is assumed
to be set-intersection), and where F is a space of data-propagation
maps acting on L, where each $f \in F$ has the form

      $f(x) = (A_f \cap x) \cup B_f$,   $x \in L$

for two subsets $A_f \supseteq B_f$ of E. Heuristically, elements of L
describe boolean data-states at a program point (such as availability
of expressions, liveness of variables, reachability of variable
occurrences etc.), and elements of F represent transformations of
these states as effected by code execution. It is well known
that for bitvectoring frameworks elements of F are distributive,
idempotent and can be compactly represented as elements of
$L \times L$; also F contains the identity map id on L and is closed
under functional compositions and meets, which can be performed
quickly using bitvector AND and OR operations (functional composition
requiring four such operations and functional meet only two).
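The operation counts quoted above can be made concrete with a small sketch. It assumes that bitvectors are stored as Python integers, that meet is set-intersection, and that each map is kept in a normalized form in which its constant part is contained in its mask; under that assumption composition costs four bitwise operations and meet two. The class `BitMap` and the helpers `make_map` and `identity` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitMap:
    """The map f(x) = (a & x) | b on bitvectors stored as Python ints.

    Assumed invariant: b is contained in a, i.e. (b & ~a) == 0.
    """
    a: int
    b: int

    def __call__(self, x):
        return (self.a & x) | self.b          # application: two operations

    def compose(self, g):
        """Return self o g (g is applied first): four operations."""
        b = (self.a & g.b) | self.b
        a = (self.a & g.a) | b                # re-establishes the invariant
        return BitMap(a, b)

    def meet(self, g):
        """Pointwise intersection of two normalized maps: two operations."""
        return BitMap(self.a & g.a, self.b & g.b)


def make_map(keep, gen):
    """Normalized map that preserves the bits in keep and sets those in gen."""
    return BitMap(keep | gen, gen)


def identity(width):
    """Identity map on bitvectors of the given width."""
    return BitMap((1 << width) - 1, 0)


f = make_map(keep=0b110, gen=0b001)
g = make_map(keep=0b011, gen=0b100)
print(f.compose(g), f.meet(g), f.compose(f) == f)
```

The last expression checks idempotence for one map; the normalization is what allows the meet of two maps to be formed componentwise.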

Once having introduced such a bitvectoring framework,
we can associate with each edge (m,n) in the original flow graph
G a data-flow map $f_{(m,n)} \in F$ describing the effect on L as
control advances from the start of the basic block m through m
to the start of the adjacent block n. Then, if $x_0 \in L$ denotes
null information to be assumed at the start of execution (i.e.
the program entry point r), and if $x_n \in L$ denotes the data
state to be computed at entry to a basic block n, we want to
solve the following set of standard data-flow equations
(*)   $x_r \le x_0$

      $x_n \le \bigwedge \{\, f_{(m,n)}(x_m) : (m,n) \in G \,\}, \quad n \in N$

Since bitvectoring frameworks are distributive, it is well known
[Ki] that the maximal fixpoint of these equations is equal to
the 'meet-over-all-paths' solution, i.e. that for each $n \in N$

      $x_n = \bigwedge \{\, f_p(x_0) : p \text{ a path leading from } r \text{ to } n \,\}$

where for each path $p = (n_1, n_2, \ldots, n_k)$, $f_p$ is defined as
$f_{(n_{k-1},n_k)} \circ \cdots \circ f_{(n_1,n_2)}$.

To solve the equations (*), we can use elimination techniques
(such as in [AC], [GW], [Ro2], [SS]). These techniques repeatedly
eliminate  nodes  belonging  to  certain  flow  subgraphs  (control 
structures)  from  the  equations.   This  involves  extension  of  the 
data-flow  maps  involved  in  these  equations  to  account  for  flow 
through  the  reduced  structures.   Once  all  structures  have  been 
reduced,  the  equations  become  trivial  and  their  solution  immediate. 
This  final  solution  can  then  be  propagated  back  to  the  nodes 
previously  eliminated. 

The  algorithm  described  below  is  an  elimination  algorithm 
of  such  a  kind,  but  uses  the  special  structural  representation 
of  the  flow  graph  built  by  the  structural  analysis  described  in 
the  preceding  section.   We  proceed  as  follows: 
I. Iterate through the nodes in STRUCTURES (i.e. from innermost
to outermost); let s be the structure currently being processed.

II. We compute two kinds of new data-flow maps: (a) for each
node u in s, an auxiliary map $f_u$ describing the effect on data
attributes of the flow from entry to s, through s, to the entry
to u; (b) for each successor v of s, an extended data-flow map
$f_{(s,v)}$ describing the effect on data attributes of flow from
the start of s, through s, to the start of v.

The computation of these maps depends on t = STRUCTYPE(s),
as follows:

(1) If t denotes a classical control structure ('IFTHENELSE',
'BLOCK', 'WHILELOOP' etc.) then we can use explicit formulae
of Rosen's type to calculate these maps. As examples, consider
the following two typical cases:

Suppose that t is 'IFTHENELSE'. Let $n_1$, $n_2$, $n_3$ be the nodes
belonging to this structure, and let v be the (only) successor of s.

[Figure: $n_1$ branches to $n_2$ and $n_3$, both of which lead to v.]

Then we have

      $f_{n_1} = \mathrm{id}$
      $f_{n_2} = f_{(n_1,n_2)}$
      $f_{n_3} = f_{(n_1,n_3)}$
      $f_{(s,v)} = (f_{(n_2,v)} \circ f_{n_2}) \wedge (f_{(n_3,v)} \circ f_{n_3})$
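The 'IFTHENELSE' case can be traced on a two-bit example. The sketch below assumes that each map is a pair (a, b) standing for x -> (a & x) | b with b contained in a, and that meet is bitwise intersection; the helper names and the concrete edge maps are hypothetical and chosen only for illustration.

```python
def apply_map(f, x):
    """Apply the map (a, b), i.e. x -> (a & x) | b."""
    a, b = f
    return (a & x) | b


def compose(f, g):
    """f o g: apply g first, then f. Keeps b contained in a."""
    (fa, fb), (ga, gb) = f, g
    b = (fa & gb) | fb
    return ((fa & ga) | b, b)


def meet(f, g):
    """Pointwise intersection of two normalized maps."""
    return (f[0] & g[0], f[1] & g[1])


def if_then_else(f12, f13, f2v, f3v, width):
    """Auxiliary maps and extended map of an if-then-else structure."""
    ident = ((1 << width) - 1, 0)
    aux = {"n1": ident, "n2": f12, "n3": f13}
    f_sv = meet(compose(f2v, aux["n2"]), compose(f3v, aux["n3"]))
    return aux, f_sv


# Hypothetical edge maps over two bits.
f12 = f13 = (0b11, 0b01)   # n1 generates bit 0 on both of its exits
f2v = (0b10, 0b10)         # n2 kills bit 0 and generates bit 1
f3v = (0b11, 0b00)         # n3 changes nothing
aux, f_sv = if_then_else(f12, f13, f2v, f3v, width=2)
print(f_sv)
```

In this example only bit 1 of the incoming state survives the structure: bit 0 is killed on the path through $n_2$, and bit 1 is generated on that path but merely preserved on the path through $n_3$.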

Suppose that t is 'WHILELOOP'. Let $n_1$, $n_2$ be the nodes of this
structure and let v be the (only) successor of s.

[Figure: $n_1$ leads to $n_2$ and to v; $n_2$ leads back to $n_1$.]

Then we have

      $f_{n_1} = \mathrm{id} \wedge (f_{(n_2,n_1)} \circ f_{(n_1,n_2)})$   (due to idempotence)
      $f_{n_2} = f_{(n_1,n_2)} \circ f_{n_1}$
      $f_{(s,v)} = f_{(n_1,v)} \circ f_{n_1}$
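The role of idempotence in the 'WHILELOOP' formulae can be checked against a direct iterative solution of the two loop equations. The sketch below again assumes maps stored as pairs (a, b) for x -> (a & x) | b with b contained in a, and intersection as meet; `iterate_loop` is a hypothetical reference solver added only for comparison and is not part of the algorithm.

```python
def apply_map(f, x):
    a, b = f
    return (a & x) | b


def compose(f, g):
    """f o g: apply g first, then f."""
    (fa, fb), (ga, gb) = f, g
    b = (fa & gb) | fb
    return ((fa & ga) | b, b)


def meet(f, g):
    return (f[0] & g[0], f[1] & g[1])


def while_loop(f12, f21, f1v, width):
    """Closed-form maps for a while loop with test n1 and body n2."""
    ident = ((1 << width) - 1, 0)
    f_n1 = meet(ident, compose(f21, f12))   # one trip suffices (idempotence)
    f_n2 = compose(f12, f_n1)
    f_sv = compose(f1v, f_n1)
    return f_n1, f_n2, f_sv


def iterate_loop(f12, f21, x0, width):
    """Reference: solve x1 = x0 & f21(x2), x2 = f12(x1), starting from top."""
    top = (1 << width) - 1
    x1, x2 = top, top
    while True:
        new1 = x0 & apply_map(f21, x2)
        new2 = apply_map(f12, new1)
        if (new1, new2) == (x1, x2):
            return x1, x2
        x1, x2 = new1, new2


# Hypothetical maps: the body kills bit 0, the back edge generates bit 1.
f12, f21, f1v = (0b10, 0b00), (0b11, 0b10), (0b11, 0b00)
f_n1, f_n2, f_sv = while_loop(f12, f21, f1v, width=2)
print(apply_map(f_n1, 0b11), iterate_loop(f12, f21, 0b11, 2))
```

The closed form needs a fixed number of compositions and meets per loop, whereas the reference solver may pass over the loop several times before it stabilizes.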


(2) Suppose that t is 'PROPINT'. Then the auxiliary maps are
computed by iterating twice through the nodes of s (in their
reverse postorder) applying the formulae

      $f_u = \bigwedge \{\, f_{(w,u)} \circ f_w : (w,u) \in G \text{ and } \mathrm{STRUCTOF}(w) = s \,\}, \quad u \ne h$

      $f_h = \mathrm{id} \wedge \bigwedge \{\, f_{(w,h)} \circ f_w : (w,h) \in G \text{ and } \mathrm{STRUCTOF}(w) = s \,\}$

where h is the entry node of s. The extended-flow maps can then
be computed using the formula

      $f_{(s,v)} = \bigwedge \{\, f_{(w,v)} \circ f_w : (w,v) \in G \text{ and } \mathrm{STRUCTOF}(w) = s \,\}$

(3) Suppose that t is 'PROPOUTINT', i.e. that s is a general
single-entry acyclic graph. Then the auxiliary maps are computed
by iterating once through the nodes of s (in reverse postorder),
starting with

      $f_h = \mathrm{id}$   (where h is the entry to s),

and then applying, for each $u \ne h$ in s,

      $f_u = \bigwedge \{\, f_{(w,u)} \circ f_w : (w,u) \in G \text{ and } \mathrm{STRUCTOF}(w) = s \,\}$

The extended-flow maps are then computed as in the preceding
case (2).
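The single pass for an acyclic structure can be sketched as follows. The sketch assumes maps stored as pairs (a, b) for x -> (a & x) | b with b contained in a, intersection as meet, and a hypothetical input format in which the nodes of s are listed in reverse postorder (entry first) and `edge_maps` holds a map for every edge leaving a node of s. It relies on the fact that in reverse postorder every predecessor inside an acyclic structure precedes its successors.

```python
from functools import reduce


def apply_map(f, x):
    a, b = f
    return (a & x) | b


def compose(f, g):
    """f o g: apply g first, then f."""
    (fa, fb), (ga, gb) = f, g
    b = (fa & gb) | fb
    return ((fa & ga) | b, b)


def meet(f, g):
    return (f[0] & g[0], f[1] & g[1])


def acyclic_structure(nodes_rpo, edge_maps, width):
    """Auxiliary maps f_u and extended maps f_(s,v) for an acyclic structure."""
    inside = set(nodes_rpo)
    aux = {nodes_rpo[0]: ((1 << width) - 1, 0)}       # f_h = id
    for u in nodes_rpo[1:]:
        terms = [compose(f, aux[w])
                 for (w, t), f in edge_maps.items()
                 if t == u and w in inside]
        aux[u] = reduce(meet, terms)                  # meet over incoming edges
    ext = {}
    for (w, v), f in edge_maps.items():
        if w in inside and v not in inside:           # exit edge of s
            term = compose(f, aux[w])
            ext[v] = meet(ext[v], term) if v in ext else term
    return aux, ext


# Hypothetical diamond h -> {a, b} -> c with a single exit c -> v.
edges = {
    ("h", "a"): (0b11, 0b01),
    ("h", "b"): (0b11, 0b00),
    ("a", "c"): (0b10, 0b00),
    ("b", "c"): (0b11, 0b10),
    ("c", "v"): (0b11, 0b00),
}
aux, ext = acyclic_structure(["h", "a", "b", "c"], edges, width=2)
print(aux["c"], ext["v"])
```

For a proper interval the same loop body would be run twice and the entry map would additionally be met with the maps of the back edges, as in case (2).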

(4) If t is an improper interval type, then we iterate
repeatedly through the nodes of s (in reverse postorder), calculating
the auxiliary maps using the formulae as in case (2) or (3) above
till convergence is attained, and then compute the extended-flow
maps as in cases (2) and (3).

III. At the end of steps I and II, the graph with which analysis
started will have been reduced to a single node (structure) s.
We then set $x_s = x_0$, where $x_0 \in L$ denotes the null data-state
to be assumed at the start of execution.
IV. Finally iterate once more through the nodes in STRUCTURES,
this time in reverse order (from outermost to innermost). Let
s be the structure currently processed. Then for all nodes u
in s we set $x_u = f_u(x_s)$. This completes our algorithm.
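The propagation phase amounts to one table lookup and one map application per node, as the following sketch suggests. It assumes that the auxiliary maps have already been computed and stored as pairs (a, b) for x -> (a & x) | b; the function `propagate`, the structure names and the concrete maps are hypothetical and serve only as an illustration.

```python
def apply_map(f, x):
    a, b = f
    return (a & x) | b


def propagate(structures, struc_nodes, aux, x0):
    """Steps III and IV: push the entry state down the structure hierarchy.

    structures  : structure nodes, innermost first (the order of STRUCTURES)
    struc_nodes : members of each structure (the role of STRUCNODES)
    aux         : auxiliary map f_u of every member u
    """
    x = {structures[-1]: x0}               # step III: x_s = x_0 for the outermost s
    for s in reversed(structures):         # step IV: outermost to innermost
        for u in struc_nodes[s]:
            x[u] = apply_map(aux[u], x[s])
    return x


# Hypothetical hierarchy: a block T containing a loop S followed by a node v;
# S in turn contains the basic blocks n1 and n2.
structures = ["S", "T"]
struc_nodes = {"S": ["n1", "n2"], "T": ["S", "v"]}
aux = {
    "S": (0b11, 0b00),    # S is the entry of T, so its map is the identity
    "v": (0b10, 0b00),
    "n1": (0b10, 0b00),
    "n2": (0b10, 0b10),
}
print(propagate(structures, struc_nodes, aux, x0=0b11))
```

Because outer structures are handled first, the state at the entry of each structure is already known when its members are processed.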

Remarks: (1) As can be noted, our approach trades off space
for time. The auxiliary maps are not really needed during
phases I-II, and are computed and stored during these phases
mainly to facilitate the propagation phase IV, which becomes
trivial once these maps are available. However, if space is
at a premium, then we need not save the auxiliary maps at phase
II, but can compute them directly in phase IV. However, this
will  increase  the  execution  time  of  our  algorithm,  since  the 
computation  of  the  extended-flow  maps  and  the  auxiliary  flow 
maps  for  a  given  structure  usually  overlap  considerably,  so 
that  in  phase  IV  we  may  repeat  a  significant   portion  of  the 
computations  performed  in  phase  II. 

(2)    Adapting  and  extending  the  techniques  of  [SS],  we  can 
devise  similar  algorithms  to  handle  the  various  other  kinds  of 
bitvectoring  analysis  useful  for  program  optimization.   These 
include  intraprocedural  backward  analysis,  such  as  live-dead 
analysis,  in  which  information  is  to  be  propagated  in  the  reverse 
direction  of  execution  flow;  interprocedural  forward  and  backward 
analyses,  which  analyze  the  effects  of  interprocedural  flow  on 
attributes  of  global  variables  and  procedure  parameters;  code 
motion associated with forward analysis (in which code is moved
to earlier program points), or with backward analysis (in which
code is moved to later program points; also known as 'code sinking').
Moreover, once having made the hierarchical structure of a flow
graph available, we can also perform various types of code
motion which cannot be performed when only the interval structure
of the program is known. For example, we can 'hoist' code within
an 'if-then-else' structure, i.e. move common code from the two
branches of such a structure to its head. Although similar
kinds  of  code  motion  can  also  be  performed  by  the  method  of  Morel 
and  Renvoise  [MR],  our  technique  allows  more  control  over  the 
motion  to  be  performed,  e.g.  we  can  move  code  even  though  it 
need  not  be  unconditionally  executed  after  the  point  to  which 
it  is  moved. 
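
As a rough illustration of hoisting within an 'if-then-else'
structure, the following Python sketch moves the common leading
statements of the two branches to the head. The statement encoding
(pairs of target variable and expression text) and the name
hoist_common_prefix are hypothetical and chosen for brevity; the
only safety check made is that a hoisted statement does not assign a
variable read by the branch condition, so this is a simplification
and not the procedure of this paper.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: a statement is (assigned variable, expression text).
Stmt = Tuple[str, str]


def hoist_common_prefix(
    cond_vars: Set[str], then_branch: List[Stmt], else_branch: List[Stmt]
) -> Tuple[List[Stmt], List[Stmt], List[Stmt]]:
    """Return (hoisted, remaining then-branch, remaining else-branch).

    Only textually identical leading statements are hoisted, and hoisting
    stops at the first statement that assigns a variable the condition reads,
    because executing it before the test would change the branch taken.
    """
    hoisted: List[Stmt] = []
    i = 0
    while i < len(then_branch) and i < len(else_branch):
        stmt = then_branch[i]
        if stmt != else_branch[i]:
            break
        target, _expr = stmt
        if target in cond_vars:
            break
        hoisted.append(stmt)
        i += 1
    return hoisted, then_branch[i:], else_branch[i:]


if __name__ == "__main__":
    then_b = [("t", "a+b"), ("x", "t*2")]
    else_b = [("t", "a+b"), ("y", "t-1")]
    print(hoist_common_prefix({"c"}, then_b, else_b))
```

With the condition reading only c, the statement t := a+b is moved to
the head and each branch keeps its own remaining statement.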

(3)    The  data-flow  analysis  described  in  this  section  can 
obviously  be  applied  directly  to  a  parse-tree  representation  of 
the  program  to  be  analyzed.   The  hierarchical  flow  structure 
of  such  a  tree  is  derivable  almost  trivially.   This  is  essentially 
Rosen's high-level data-flow analysis technique ([Ro1], [Ro2]).
Techniques for handling explicit 'goto' statements are noted
in [MSF]. Note also that structures of the general interval
types  will  not  appear  in  this  case,  but  that  structures  having 
'abnormal'  exits  will  appear  and  can  be  handled  in  much  the 
same  way  as  the  single-exit  structures.   This  approach  can  also 
be  generalized,  using  the  methods  of  [SS],  e.g.  to  handle  in- 
terprocedural  analyses  and  code  motion. 
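
To make the parse-tree variant concrete, the following Python sketch
computes a flow map for each construct bottom-up over a small tree.
It is an illustration under stated assumptions, not the algorithm of
this paper: flow maps are gen/kill pairs stored as Python integers
used as bitvectors, the meet is taken to be union (as in a
reaching-definitions problem), the tuple encoding of tree nodes is
hypothetical, and the loop closure is taken as id meet (body o cond),
the form used for the while loop in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Bitvector flow map x -> (x & ~kill) | gen."""
    gen: int = 0
    kill: int = 0

    def apply(self, x: int) -> int:
        return (x & ~self.kill) | self.gen


ID = Flow()  # identity map: generates nothing, kills nothing


def compose(g: Flow, f: Flow) -> Flow:
    """Return g o f, i.e. apply f first and then g."""
    return Flow(gen=(f.gen & ~g.kill) | g.gen, kill=f.kill | g.kill)


def meet(f: Flow, g: Flow) -> Flow:
    """Pointwise union of two flow maps (assumed meet operation)."""
    return Flow(gen=f.gen | g.gen, kill=f.kill & g.kill)


def flow_of(node) -> Flow:
    """Flow map from the entry to the exit of a parse-tree node.

    Hypothetical node encodings:
      ("basic", gen, kill), ("seq", [children]),
      ("if", cond, then_part, else_part), ("while", cond, body)
    """
    kind = node[0]
    if kind == "basic":
        return Flow(node[1], node[2])
    if kind == "seq":
        f = ID
        for child in node[1]:
            f = compose(flow_of(child), f)
        return f
    if kind == "if":
        cond = flow_of(node[1])
        return meet(compose(flow_of(node[2]), cond),
                    compose(flow_of(node[3]), cond))
    if kind == "while":
        cond, body = flow_of(node[1]), flow_of(node[2])
        at_head = meet(ID, compose(body, cond))  # closure of one trip round
        return compose(cond, at_head)
    raise ValueError(f"unknown node kind: {kind}")


if __name__ == "__main__":
    # Bit 0: 'x := 1' before the loop; bit 2: 'x := x + 1' in the loop body.
    tree = ("seq", [("basic", 0b001, 0b101),
                    ("while", ("basic", 0, 0), ("basic", 0b100, 0b101))])
    print(bin(flow_of(tree).apply(0)))
```

In the example both definitions of x reach the loop exit, so applying
the resulting map to the empty set yields bits 0 and 2.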

Let us now indicate why the data-flow analysis algorithm
described in this section can run faster than the corresponding
interval-based algorithm of [SS]. We can estimate the execution
time of bitvectoring data-flow analysis algorithms by the number
of bitvector operations that they perform. As noted in [SS],
each functional composition then takes four units of time, and
each functional meet, as well as each functional application,
two units of time.

As  an  indication  of  the  key  factors  controlling  efficiency, 
consider  the  following  program  fragment 

    while A do
    begin
        B_1
        ...
        B_n
    end


and  call  it  I.   Standard  interval-based  analysis  iterates  twice 
over  the  nodes  of  I,  applying  the  following  formulae  (where  v  is 
the  successor  of  I) 


\[
\begin{aligned}
f_A &= \mathrm{id} \\
f'_A &= f_A \wedge \bigl( f_{(B_n,A)} \circ f_{B_n} \bigr) \\
f_{B_1} &= f_{(A,B_1)} \circ f_A \\
f_{B_i} &= f_{(B_{i-1},B_i)} \circ f_{B_{i-1}}, \qquad i = 2,\ldots,n \\
f_{(I,v)} &= f_{(A,v)} \circ f_A
\end{aligned}
\]

This involves 2n+2 compositions and 1 meet, taking altogether
8n + 10 time units.

By contrast, our structural analysis approach will first
reduce B_1,...,B_n to a 'BLOCK' B and then reduce I as a
'WHILELOOP'. The associated data-flow map computations will be
as follows:


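\[
\begin{aligned}
f_{B_1} &= \mathrm{id} \\
f_{B_i} &= f_{(B_{i-1},B_i)} \circ f_{B_{i-1}}, \qquad i = 2,\ldots,n \\
f_{(B,A)} &= f_{(B_n,A)} \circ f_{B_n} \\
f_{(A,B)} &= f_{(A,B_1)} \\
f_A &= \mathrm{id} \wedge \bigl( f_{(B,A)} \circ f_{(A,B)} \bigr) \\
f_B &= f_{(A,B)} \circ f_A \\
f_{(I,v)} &= f_{(A,v)} \circ f_A
\end{aligned}
\]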

and so involve only n+3 compositions and 1 meet, altogether
4n + 14 time units.

The final propagation phase is equally efficient in both
approaches. Although in the structural analysis approach one
also needs to propagate attribute data to B, this data can then
be transmitted directly to B_1, avoiding an extra propagation step.

Overall,  we  see  that  elimination  of  the  while-loop  nodes 
takes  roughly  twice  as  much  time  in  interval-based  analysis  as 
it  does  in  structural  decomposition-based  analysis. 
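
The two operation counts can be checked mechanically. The following
Python sketch is only a tally, with hypothetical function names: it
walks through the formulae of each scheme for a loop with n body
blocks, counts compositions and meets, and charges 4 units per
composition and 2 per meet as stated above.

```python
# Cost model quoted from the text (after [SS]), in units of bitvector work.
COMPOSITION = 4
MEET = 2


def interval_based(n: int) -> tuple:
    """(compositions, meets) for the two-pass interval-based scheme."""
    compositions = meets = 0
    for _ in range(2):           # two passes over B_1, ..., B_n
        compositions += 1        # f_{B_1} = f_(A,B_1) o f_A
        compositions += n - 1    # f_{B_i}, i = 2..n
    compositions += 1            # f'_A needs one composition ...
    meets += 1                   # ... and one meet
    compositions += 1            # f_(I,v) = f_(A,v) o f_A
    return compositions, meets


def structural(n: int) -> tuple:
    """(compositions, meets) for the BLOCK-then-WHILELOOP reduction."""
    compositions = meets = 0
    compositions += n - 1        # f_{B_i}, i = 2..n, inside the BLOCK
    compositions += 1            # f_(B,A) = f_(B_n,A) o f_{B_n}
    compositions += 1            # f_A needs one composition ...
    meets += 1                   # ... and one meet
    compositions += 1            # f_B = f_(A,B) o f_A
    compositions += 1            # f_(I,v) = f_(A,v) o f_A
    return compositions, meets


def time_units(counts: tuple) -> int:
    compositions, meets = counts
    return COMPOSITION * compositions + MEET * meets


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, time_units(interval_based(n)), time_units(structural(n)))
```

For n = 10 this gives 90 versus 54 units. The two totals coincide at
n = 1, and their ratio approaches 2 as n grows.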

Remarks: (1) It can be shown that such savings in time are
obtained only in connection with the reduction of loops. Reduction
of acyclic structures such as 'BLOCK' or 'IFTHENELSE' does not
save time by itself, but only when these structures are imbedded
within a loop.

(2)  The  algorithm  given  in  this  section  can  lose  time  when 
computing  data-flow  maps  for  the  virtual  edges  added  to  the 
original  graph.  This  will  generally  happen  when  a  single  edge 
going  from  a  deeply  imbedded  structure  to  a  relatively  outer 
structure induces a series of such virtual edges. Nevertheless,
for well-structured programs with no explicit 'goto' or 'exit'
statements such loss will not occur.

(3) The time gain demonstrated above becomes even more significant
for data-flow analyses having rapid but nonbitvectoring
frameworks ([Ro2]).

(4) The above observation concerning time efficiency suggests
various significant improvements of standard interval-based
elimination  algorithms.   In  particular,  in  processing  an  interval 
I,  we  can  implicitly  reduce  all  the  nodes  within  I  to  a  single 
acyclic structure I' (of type 'PROPOUTINT' or 'IMPROPOUTINT' as
in  section  2),  and  then  reduce  I'  to  I  as  a  self-loop.   This  will 
realize  much  the  same  saving  as  is  achieved  by  our  structural 
analysis,  since  it  will  avoid  double  iteration  through  the  nodes 
of  I. 

4. Conclusions

The  approach  presented  in  this  paper  narrows  the  gap  between 
the  two  common  optimization  techniques:  that  which  uses  parse-tree 
structure directly ([Wu], [Ro1]), and that which uses an
intermediate code representation as input to the optimization phase
[Al]. By introducing structural analysis, we can develop a
rather  uniform  program  representation  on  which  Rosen's  style  of 
bitvectoring  data-flow  analysis  can  be  performed  in  a  manner 
which  is  more  efficient  than  the  improved  interval  analysis  of 
[SS], which also facilitates additional useful types of
code motion, and which has all the benefits listed in [Ro1].
Appropriate  extensions  of  this  technique  to  interprocedural 
analysis,  backward  analysis  and  code  motion  enable  one  to  cope 
with  the  variety  of  practical  bitvectoring  data-flow  problems 
likely to occur in code optimization.

More complicated data-flow analysis problems encountered
in practical code optimization either belong to the class of
'rapid' analyses defined in [Ro2] and hence can be performed
using  essentially  the  technique  outlined  in  the  previous  section, 
or else deal with various (non-boolean) variable attributes
and cannot usually be performed by elimination (analyses of
this latter class include constant propagation, type-checking,
range-checking etc.). This second class of data-flow problems
can be solved using iterative propagation along use-definition
links (cf. [Te]), which can in turn be calculated using the
classical bitvectoring 'reaching definitions' analysis [He].
Thus these problems can be solved in a manner which is
independent of the program representation, as long as the
use-definition map can be computed from that representation.
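
As a small illustration of how use-definition links follow from
reaching definitions, the Python sketch below solves the
reaching-definitions problem by plain iteration over a flow graph
and then reads off the links. It is a minimal sketch under
assumptions of our own: Python sets stand in for bitvectors, a
block is a list of (target, variables read) pairs, a definition is
named by its (block, index) position, and the function names are
hypothetical. It is not the elimination algorithm of this paper.

```python
def reaching_definitions(blocks, succ):
    """Return, for each block, the set of definitions reaching its entry."""
    defs_of = {}                      # variable -> all definitions of it
    for b, stmts in blocks.items():
        for i, (target, _reads) in enumerate(stmts):
            defs_of.setdefault(target, set()).add((b, i))

    pred = {b: [] for b in blocks}
    for b, targets in succ.items():
        for s in targets:
            pred[s].append(b)

    def transfer(b, reaching):
        cur = set(reaching)
        for i, (target, _reads) in enumerate(blocks[b]):
            cur -= defs_of[target]    # a new definition kills the others
            cur.add((b, i))
        return cur

    rd_in = {b: set() for b in blocks}
    rd_out = {b: transfer(b, set()) for b in blocks}
    changed = True
    while changed:                    # iterate to a fixed point
        changed = False
        for b in blocks:
            new_in = set().union(*(rd_out[p] for p in pred[b]))
            new_out = transfer(b, new_in)
            if new_in != rd_in[b] or new_out != rd_out[b]:
                rd_in[b], rd_out[b] = new_in, new_out
                changed = True
    return rd_in


def use_def_links(blocks, rd_in):
    """Map each use (block, index, variable) to the definitions reaching it."""
    links = {}
    for b, stmts in blocks.items():
        cur = set(rd_in[b])
        for i, (target, reads) in enumerate(stmts):
            for v in reads:
                links[(b, i, v)] = {d for d in cur
                                    if blocks[d[0]][d[1]][0] == v}
            cur = {d for d in cur if blocks[d[0]][d[1]][0] != target}
            cur.add((b, i))
    return links


if __name__ == "__main__":
    blocks = {"entry": [("x", ())],        # x := 1
              "loop": [("x", ("x",))],     # x := x + 1
              "exit": [("y", ("x",))]}     # y := x
    succ = {"entry": ["loop"], "loop": ["loop", "exit"]}
    links = use_def_links(blocks, reaching_definitions(blocks, succ))
    for use in sorted(links):
        print(use, sorted(links[use]))
```

In the example the use of x inside the loop is linked to both
definitions of x, while the use after the loop is linked only to the
definition in the loop body. An analysis such as constant
propagation can then iterate along these links without further
reference to the shape of the flow graph.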

Overall,  we  conclude  that  for  the  analyses  considered  above 
the  flow  graph  representation  has  no  real  advantage  over  a  parse- 
tree  representation  of  the  program  being  analyzed.   This  suggests 
one of the following design responses: (i) Discard the
intermediate-level code representation altogether, thus eliminating
an unnecessary and time-consuming compiler pass. (ii) Retain the
intermediate code representation, which might still be useful for
other reasons (it is more uniform and front-end independent, more
readable, easier to test, etc.). If this latter approach is
adopted,  the  structural  analysis  approach  outlined  in  this 
paper  can  be  used  to  support  structural  data-flow  analysis  with 
all the benefits described above.